Conversation
|
Can we add a description to the PRs so that we also iterate on what description we want to present upstream? Description should contain a similar format as the template we have on internal PRs, such as brief description, testing, whether or not this is a breaking change... |
|
Could we also have all changes be in one commit? it is slightly confusing to go over commits and see one function added and then removed, we would eliminate the confusion by having one commit with all the changes. |
f6293b2 to
ce6ffbf
Compare
Merged the commits |
| dt = DeltaTable(table_path) | ||
| actions, current_version = dt.peek_next_commit(version=version) | ||
| assert (len(actions), current_version) == expected | ||
|
|
There was a problem hiding this comment.
maybe we could add a test to cover DeltaLogNotFound.
There was a problem hiding this comment.
I guess there's no way to test the loop? I'm thinking. dt.peek_next_commit(version) where version does not have an object but version+1 does and _latest_version is currently version.
There was a problem hiding this comment.
I'm a little confused here, do you mean use version=-1 as input or something else?
There was a problem hiding this comment.
I was thinking there'd be a unit test in which the underlying delta table has actual latest version 10, the DeltaTable thinks the latest version is 6, and version 7 doesn't actually exist. So we call dt.peek_next_commit(6) and expect the next commit to be 8, which requires
- realizing
7is bigger than the thought-to-be latest version6 - updating the latest version which is now
10 - checking if version
7is there, which it isn't - trying again with version
8and finding version8to return.
There was a problem hiding this comment.
Added data that missed a commit and a test for it. get_latest_version function can get the actual latest version when there is a missed commit (e.g., if version 7 doesn't exist and the actual latest version is 10, DeltaTable.get_latest_version will return 10), so we don't need to update the latest version.
|
I ran --- Build Python binding --- error: future cannot be sent between threads safelyeltalake-core error[E0624]: associated function error: aborting due to 3 previous errors 413/416: deltalake-core Some errors have detailed explanations: E0432, E0624.talake-core For more information about an error, try error: could not compile They seems not relate to our change, I got same errors when I ran |
90d7153 to
91754aa
Compare
|
ACTION NEEDED delta-rs follows the Conventional Commits specification for release automation. The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification. |
| }) => { | ||
| return Err(DeltaTableError::DeltaLogNotFound(current_version)); | ||
| } | ||
| Err(err) => Err(err), |
There was a problem hiding this comment.
Why return Ok(Err(err)) instead of Err(err)?
There was a problem hiding this comment.
Sorry, where is Ok(Err(err))?
Here we use the same method as peek_next_commit below to get the commit_log_bytest, then we return the full commit_log_bytes.
There was a problem hiding this comment.
In this line, we assign commit_log_bytes to Err(err). Then on line 470, we return Ok(commit_log_bytes). I believe as a result we will return Ok(Err(err)) if we reach line 468.
Am I interpreting this correctly? If so, this seems wrong. I think you'd want to return Err(err) here and on line 462 you'd want to assign commit_log_bytes to bytes, not Ok(bytes) (which would thus return Ok(Ok(bytes)) which is weird as well).
There was a problem hiding this comment.
yes seems weird we need to fix this, also need to add a test for error case.
There was a problem hiding this comment.
as mentioned by ohan, rust ? seems to tidy the results, https://doc.rust-lang.org/rust-by-example/std/result/question_mark.html
There was a problem hiding this comment.
Flat out missed the question mark at the end, ok great.
python/deltalake/table.py
Outdated
| next_version += 1 | ||
| else: | ||
| raise | ||
| logging.info(f"Provided Delta Version is up to date. Version: {version}") |
There was a problem hiding this comment.
Remove this log line, I don't think it's necessary. Also removes the import statement above.
| dt = DeltaTable(table_path) | ||
| actions, current_version = dt.peek_next_commit(version=version) | ||
| assert (len(actions), current_version) == expected | ||
|
|
There was a problem hiding this comment.
I guess there's no way to test the loop? I'm thinking. dt.peek_next_commit(version) where version does not have an object but version+1 does and _latest_version is currently version.
python/tests/test_table_read.py
Outdated
| def test_delta_log_not_found(): | ||
| table_path = "../crates/deltalake-core/tests/data/simple_table" | ||
| dt = DeltaTable(table_path) | ||
| latest_version = dt._table.get_latest_version() |
There was a problem hiding this comment.
Stylistically, this is access to a private member _table. Is that OK or do we need to promote get_latest_version() to also be a public member function of DeltaTable?
There was a problem hiding this comment.
Added get_latest_version() as a public member.
|
Should we have a GitHub issue in github/delta-io/delta-rs/issues for this feature request? |
|
@PengLiVectra regarding the errors about |
|
do we know why some tests are failing? |
f392256 to
167e954
Compare
359306a to
b402f3b
Compare
b402f3b to
005d52a
Compare
Description
Expose peek_next_commit to python. Can be used for streaming delta commit changes.
peek_next_commitwill return actions in the commit and commit version of next commit. If current version is the latest version, it will return None (actions will be None) and the current version.An example of usage:
actions, next_version = delta_table.peek_next_commit(current_version)
If current_version is the latest version, the return will be like None, current_version.
Related Issue(s)
delta-io#1886
Testing
Added unit tests that passed.
Breaking-Change
Not a breaking change.
Documentation